terminal time
Is Pontryagin's Maximum Principle all you need? Solving optimal control problems with PMP-inspired neural networks
Kamtue, Kawisorn, Moura, Jose M. F., Sangpetch, Orathai
Calculus of Variations is the mathematics of functional optimization, i.e., optimization in which the solutions are functions over a time interval. This is particularly important when the time interval is unknown, as in minimum-time control problems, where forward-in-time solutions are not possible. Calculus of Variations offers a robust framework for learning optimal control and inference. How can this framework be leveraged to design neural networks to solve challenges in control and inference? We propose the Pontryagin's Maximum Principle Neural Network (PMP-net), which is tailored to estimate control and inference solutions in accordance with the necessary conditions outlined by Pontryagin's Maximum Principle. We assess PMP-net on two classic optimal control and inference problems: optimal linear filtering and minimum-time control. Our findings indicate that PMP-net can be effectively trained in an unsupervised manner to solve these problems without the need for ground-truth data, successfully deriving the classical "Kalman filter" and "bang-bang" control solutions. This establishes a new approach for addressing general, possibly yet unsolved, optimal control problems.
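To make the "bang-bang" solution mentioned in the abstract above concrete, here is a minimal sketch for the textbook double integrator x'' = u with |u| <= 1, steered to rest at the origin. The choice of system and the names `min_time_plan` and `propagate` are assumptions made for illustration; the abstract does not say which system the paper studies, and this closed-form plan is not the PMP-net method itself.

```python
import math

def min_time_plan(x0, v0):
    """Closed-form minimum-time plan for x'' = u, |u| <= 1, target (0, 0).

    Returns (u_first, t_switch, t_final): apply u_first until t_switch,
    then -u_first until t_final. Hypothetical helper for illustration.
    """
    # Sign of s tells on which side of the switching curve x = -v|v|/2 we start.
    s = x0 + 0.5 * v0 * abs(v0)
    sign = 1.0 if s >= 0 else -1.0
    # Mirror the state so that it lies on or above the switching curve.
    x, v = sign * x0, sign * v0
    t1 = v + math.sqrt(0.5 * v * v + x)  # duration of the first arc
    t2 = t1 - v                          # duration of the second arc
    return -sign, t1, t1 + t2

def propagate(x, v, u, t):
    """Exact state of the double integrator after time t under constant u."""
    return x + v * t + 0.5 * u * t * t, v + u * t

# Start at rest one unit away from the origin.
u1, t_switch, t_final = min_time_plan(1.0, 0.0)
x, v = propagate(1.0, 0.0, u1, t_switch)
x, v = propagate(x, v, -u1, t_final - t_switch)
print(u1, t_switch, t_final, x, v)
```

The control takes only its extreme values and switches once, which is the structure that Pontryagin's Maximum Principle predicts for this problem.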
Deep Learning Methods for S Shaped Utility Maximisation with a Random Reference Point
We consider the portfolio optimisation problem where the terminal function is an S-shaped utility applied to the difference between the wealth and a random benchmark process. We develop several numerical methods for solving the problem using deep learning and duality methods. We use deep learning methods to solve the associated Hamilton-Jacobi-Bellman equation for both the primal and dual problems, and the adjoint equation arising from the stochastic maximum principle. We compare the solution of this non-concave problem to that of the concavified utility, a random function depending on the benchmark, in both complete and incomplete markets. We give some numerical results for power and log utilities to show the accuracy of the suggested algorithms.
Learning Free Terminal Time Optimal Closed-loop Control of Manipulators
Hu, Wei, Zhao, Yue, E, Weinan, Han, Jiequn, Long, Jihao
This paper presents a novel approach to learning free terminal time closed-loop control for robotic manipulation tasks, enabling dynamic adjustment of task duration and control inputs to enhance performance. We extend the supervised learning approach, namely solving selected optimal open-loop problems and utilizing them as training data for a policy network, to the free terminal time scenario. Three main challenges are addressed in this extension. First, we introduce a marching scheme that enhances the solution quality and increases the success rate of the open-loop solver by gradually refining time discretization. Second, we extend the QRnet in Nakamura-Zimmerer et al. (2021b) to the free terminal time setting to address discontinuity and improve stability at the terminal state. Third, we present a more automated version of the initial value problem (IVP) enhanced sampling method from previous work (Zhang et al., 2022) to adaptively update the training dataset, significantly improving its quality. By integrating these techniques, we develop a closed-loop policy that operates effectively over a broad domain with varying optimal time durations, achieving near globally optimal total costs.
Multi-Agent Shape Control with Optimal Transport
Lin, Alex Tong, Osher, Stanley J.
Optimal control seeks to find the best policy for an agent that optimizes a certain criterion. This general formulation allows optimal control theory to be applied in numerous areas such as robotics, finance, and aeronautics. Classically, optimal control optimizes the control of a single agent, but in recent years, extending optimal control problems to multi-agent settings has been a popular trend. Indeed, there are numerous cases where we want to model not just a single agent, but many, e.g., a fleet of drones. Here we introduce MASCOT: Multi-Agent Shape Control with Optimal Transport, a method to compute solutions to multi-agent optimal control problems that involve shape, formation, or density constraints among the agents. These constraints can be formulated in the running cost of the agents, as a terminal cost, or both. We first introduce the reader to optimal control and its multi-agent version. We then review the idea of optimal transport and Earth Mover's Distance. Finally, we demonstrate the method on some examples.
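As a small illustration of the Earth Mover's Distance mentioned in the abstract above, the sketch below handles the simplest case: two equally sized sets of equally weighted points on a line, where the optimal transport plan pairs the points in sorted order. The function name `emd_1d` and the one-dimensional restriction are assumptions made for illustration; this is not the MASCOT method, which concerns agents in general configurations.

```python
def emd_1d(a, b):
    """Earth Mover's Distance between two equal-size 1D point sets.

    Each point carries mass 1/n. In one dimension the optimal plan matches
    the i-th smallest point of `a` to the i-th smallest point of `b`.
    Hypothetical helper for illustration.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("point sets must be non-empty and of equal size")
    pairs = zip(sorted(a), sorted(b))
    return sum(abs(x - y) for x, y in pairs) / len(a)

# Agent positions and a target formation shifted by five units.
agents = [0.0, 1.0, 3.0]
target = [5.0, 6.0, 8.0]
print(emd_1d(agents, target))
```

A cost of this kind can act as a terminal cost in the sense of the abstract: it is zero exactly when the agents occupy the target positions, regardless of which agent ends up at which position.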